Search results for "Database Management Systems"
showing 10 items of 15 documents
epiPATH: an information system for the storage and management of molecular epidemiology data from infectious pathogens.
2007
Background: Most research scientists working in the fields of molecular epidemiology, population and evolutionary genetics are confronted with the management of large volumes of data. Moreover, the data used in studies of infectious diseases are complex and usually derive from different institutions such as hospitals or laboratories. Since no public database scheme incorporating clinical and epidemiological information about patients and molecular information about pathogens is currently available, we have developed an information system, composed of a main database and a web-based interface, which integrates both types of data and satisfies requirements of good organization, simple…
Distributed image retrieval on DAISY
2006
The paper describes an application of image retrieval based on the DAISY architecture (distributed architecture for intelligent system). The creation of pictorial indexes may require a number of hours depending on the size of the pictorial database. The problem can become more complex in the case of distributed database systems. In both cases a distributed architecture can be the natural and more efficient solution. The DAISY architecture is based on the concept of co-operating behavioral agents supervised by a central engagement module. Preliminary experiments, to evaluate the performance of the system, have been performed on an astronomical database and coral image
PASSIM – an open source software system for managing information in biomedical studies
2007
Background: One of the crucial aspects of day-to-day laboratory information management is collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is absolutely imperative for a successful outcome of a biomedical study. Currently available software solutions are largely limited to large-scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but often implies sufficient investment of time, effort and funds, which are not always available. There is a clear need for lightweig…
FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications
2017
Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…
Enhancing Privacy and Authorization Control Scalability in the Grid through Ontologies
2009
The use of data Grids for sharing relevant data has proven to be successful in many research disciplines. However, the use of these environments when personal data are involved (such as in health) is limited by a lack of trust. There are many approaches that provide encrypted storage and key shares to prevent access by unauthorized users. However, these approaches are additional layers that should be managed along with the authorization policies. We present in this paper a privacy-enhancing technique that uses encryption and relates to the structure of the data and their organizations, providing a natural way to propagate authorization and also a framework that fits with many u…
Controlling false match rates in record linkage using extreme value theory
2011
Cleansing data from synonyms and homonyms is a relevant task in fields where high quality of data is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus our attention on the case of homonym errors (in the following denoted as ‘false matches’), in which records belonging to different entities are wrongly classified as equal. Synonym errors (‘false non-matches’) occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because in our application domain they are not as crucial as false matches. Fa…
CoCoDat: a database system for organizing and selecting quantitative data on single neurons and neuronal microcircuitry.
2004
We present a novel database system for organizing and selecting quantitative experimental data on single neurons and neuronal microcircuitry that has proven useful for reference-keeping, experimental planning and computational modelling. Building on our previous experience with large neuroscientific databases, the system takes into account the diversity and method-dependence of single cell and microcircuitry data and provides tools for entering and retrieving published data without a priori interpretation or summarizing. Data representation is based on the framework suggested by biophysical theory and enables flexible combinations of data on membrane conductances, ionic and synaptic current…
EVpedia: a community web portal for extracellular vesicles research
2014
Motivation: Extracellular vesicles (EVs) are spherical bilayered proteolipids, harboring various bioactive molecules. Due to the complexity of the vesicular nomenclatures and components, online searches for EV-related publications and vesicular components are currently challenging. Results: We present an improved version of EVpedia, a public database for EVs research. This community web portal contains a database of publications and vesicular components, identification of orthologous vesicular components, bioinformatic tools and a personalized function. EVpedia includes 6879 publications, 172 080 vesicular components from 263 high-throughput datasets, and has been accessed more tha…
BlotBase: a northern blot database.
2008
With the availability of high-throughput gene expression analysis, multiple public expression databases emerged, mostly based on microarray expression data. Although these databases are of significant biomedical value, they have significant drawbacks, especially concerning the reliability of single gene expression profiles obtained from microarray data. At the same time, reliable data on an individual gene's expression are often published as single northern blots in individual publications. These data were not yet available for high-throughput screening. To reduce the gap between high-throughput expression data and individual highly reliable expression data, we designed a novel database "Blo…
GPCALMA: A Grid-based tool for mammographic screening
2005
The next generation of High Energy Physics (HEP) experiments requires a GRID approach to a distributed computing system and the associated data management: the key concept is the Virtual Organisation (VO), a group of distributed users with a common goal and the will to share their resources. A similar approach is being applied to a group of hospitals that joined the GPCALMA project (Grid Platform for Computer Assisted Library for MAmmography), which will allow common screening programs for early diagnosis of breast and, in the future, lung cancer. HEP techniques come into play in writing the application code, which makes use of neural networks for the image analysis and proved to be useful…